Training apparatus and method

ABSTRACT

Training apparatus for training a user to engage in transactions (e.g. a foreign language conversation) with another person whom the apparatus is arranged to simulate, the apparatus comprising:  
     an input for receiving input dialogue from a user;  
     a lexical store containing data relating to individual words of said input dialogue;  
     a rule store containing rules specifying grammatically allowable relationships between words of said input dialogue;  
     a transaction store containing data relating to allowable transactions between said user and said person;  
     a processor arranged to process the input dialogue to recognise the occurrence therein of words contained in said lexical store in the relationships specified by the rules contained in said rule store in accordance with the data specified in the transaction store, and to generate output dialogue indicating when correct input dialogue has been recognised; and  
     an output device for making the output dialogue available to the user.

[0001] This invention relates to apparatus and methods for training; particularly, but not exclusively, for language training.

[0002] In language training, various different skills may be developed and tested. For example, our earlier application GB 2242772 discloses an automated pronunciation training system, in some respects improving upon the well known “language laboratory” automated test equipment.

[0003] Training and dialogue is carried out by human teachers who are experienced in the target language (i.e. the language to be learnt). In such training, the teacher will understand what is being said, even when the grammar is imperfect, and can exercise judgment in indicating when a serious or trivial mistake is made, and in explaining what the correct form should be.

[0004] Ultimately, it may become possible to provide a computer which would duplicate the operation of such a language teacher, in properly comprehending the words of a student, carrying out a full dialogue, and indicating errors committed by the student. However, although the fields of artificial intelligence and machine understanding are advancing, they have not as yet reached this point.

[0005] EP-A-0665523 briefly discloses a foreign language skills maintenance system, in which role playing is permitted, comprising an input for receiving input dialogue from a user and an output at which the “correct” dialogue which would be anticipated from the user is displayed, for comparison with the input dialogue by the user (or by the computer).

[0006] An object of the present invention is to provide a training system (particularly for language training but possibly applicable more widely) which utilises limited volumes of memory to store limited numbers of words and grammatical data, but is nonetheless capable of recognising input language errors and of carrying on a dialogue with a student.

[0007] Aspects of the invention are defined in the appended claims.

[0008] In an embodiment, the present invention provides a display of a person, and is arranged to vary the display to have different expressions, corresponding to comprehension, and at least one degree of incomprehension. Preferably, two degrees of incomprehension are provided; one corresponding to an assumed error in an otherwise comprehensible input and the other corresponding to incomprehensible input.

[0009] In an embodiment, a display is provided which indicates target language responses generated by the invention, together with text (preferably in the target language) indicating the level of comprehension achieved. Thus, an error is indicated without interrupting the target language dialogue.

[0010] Preferably, in an embodiment, the invention provides for the generation of source language text for the guidance of the student. Preferably, the source language text is normally hidden and is displayed on command by the user.

[0011] Very preferably, the source language text comprises guidance as to what the last target language output text means.

[0012] Very preferably, the guidance text comprises an explanation of what any detected error is assumed to be.

[0013] Very preferably, the guidance text comprises text indicating what suitable next responses by the student might be.

[0014] Alternatively, the invention may comprise speech recognition means for the input of speech and/or speech synthesis means for the generation of speech, to replace input and/or output text in the above embodiments.

[0015] Preferably, the invention comprises a terminal for use by the student at which input is accepted and output is generated, and a remote computer at which the processing necessary to convert each input from the user to corresponding outputs is performed, the two being linked together by a telecommunications channel. This arrangement permits the processing resources required to be centralised, rather than requiring them to be present for each user (language student). It also provides for effective use of the telecommunications channel, since much of the traffic is relatively low bandwidth text information.

[0016] Preferably, in this embodiment, the telecommunications channel comprises the network of high bandwidth links interconnecting computer sites known as the “Internet”. Where this is the case, the invention may conveniently be realised as a mobile program (“applet”) which is downloaded initially, and operates with conventional resident communications programs referred to as “HTML browsers”.

[0017] In an embodiment, the invention operates by reference to data relating to words, and data relating to grammatical rules.

[0018] This enables a far greater range of input and output dialogue, for the same memory usage, than direct recognition and/or generation of dialogue phrases.

[0019] The presence of errors may be detected by providing a first set of rules which are grammatically correct, and associated with each of the first set, a respective second set of rules each of which relaxes a constraint of the respective first rule to which it relates. Input text is then parsed by using rules of the first set and, at least where this is unsuccessful, rules of the second sets; where text is successfully parsed by a rule of the second set but not by the first set rule to which that second set relates, the error determined to be present is that corresponding to the constraint which was relaxed in the rule of the second set.

[0020] Aspects of the invention will now be illustrated, by way of example only, with reference to the accompanying drawings in which:

[0021] FIG. 1 is a block diagram showing schematically the apparatus of an embodiment of the invention;

[0022] FIG. 2 is a block diagram showing in greater detail the structure of a user interface terminal forming part of FIG. 1;

[0023] FIG. 3 is an illustrative diagram of the display shown on a display device forming part of the terminal of FIG. 2;

[0024] FIGS. 4a-4d are exemplary displays shown on the display of FIG. 3;

[0025] FIG. 5 is a block diagram showing schematically the structure of a host computer forming part of FIG. 1;

[0026] FIG. 6 is a flow diagram showing schematically the general process performed by the user interface terminal of FIG. 2;

[0027] FIG. 7 illustrates the structure of a control message transmitted from the host computer of FIG. 5 to the user interface terminal of FIG. 2;

[0028] FIG. 8 is a diagram showing schematically the contents of a store forming part of the host computer of FIG. 5;

[0029] FIG. 9 (comprising FIGS. 9a-9f) is a flow diagram showing schematically the process of operation of the host computer of FIG. 5.

[0030] Referring to FIG. 1, the system of a first embodiment of the invention comprises a terminal 10 such as a personal computer connected, via a telecommunications link 12 such as a telephone line, to a telecommunications network 14 such as the Internet, which in turn is connected to a host computer 20. Both the terminal 10 and the host computer 20 are conveniently arranged to communicate in a common file transfer protocol such as TCP/IP.

[0031] Referring to FIG. 2, the terminal 10 comprises a central processing unit 102, a keyboard 104, a modem 106 for communication with the telecommunications link 12, a display device 108 such as a CRT, and a store 110, schematically indicated as a single unit but comprising read only memory, random access memory, and mass storage such as a hard disk. These are interconnected via a bus structure 112.

[0032] Within the store 110 is a frame buffer area, to which pixels of the display device 108 are memory mapped. The contents of the frame buffer comprise a number of different window areas when displayed on the display device 108, as shown in FIG. 3; namely, an area 302 defining an input text window; an area 304 carrying a visual representation of a person; an area 306 defining an output text window; an area 308 defining a comprehension text window; an area 310 displaying a list of possible items; an area 312 defining a transaction result window; and an area 314 defining a user guidance window. The CPU 102 is arranged selectively to hide the response guidance window 314, and to display an icon 315, the response guidance window being displayed only when the icon 315 is selected via the keyboard or other input device.

[0033] FIG. 4a illustrates the appearance of the display device 108 in use; the response guidance display area 314 is hidden, and icon 315 is displayed.

[0034] Also stored within the store 110 are a set of item image data files, represented in a standardised format such as for example a .GIF or .PIC format, each being sized to be displayed within the transaction result area 312, and a set of expression image data files defining different expressions of the character displayed in the person area 304. Finally, data defining a background image is also stored.

[0035] Referring to FIG. 5, the host computer 20 comprises a communications port 202 connected (e.g. via an ISDN link) to the Internet 12; a central processing unit 204; and a store 206. Typically, the host computer 20 is a mainframe computer, and the store comprises a large scale off line storage system (such as a RAID disk system) and random access memory.

[0036] Control and Communications

[0037] The terminal 10 and host computer 20 may operate under conventional control and communications programs. In particular, in this embodiment the terminal 10 may operate under the control of a GUI such as Windows (TM) and a Worldwide Web browser such as Netscape (TM) Navigator (TM) which is capable of receiving and running programs (“Applets”) received from the Internet 12. The host computer 20 may operate under the control of an operating system such as Unix (TM) running a Worldwide Web server program (e.g. httpd). In view of the wide availability of such operating programs, further details are unnecessary here.

[0038] General Overview of System Behaviour

[0039] In this embodiment, the scenario used to assist in language training is that of the grocer's shop selling a variety of foods.

[0040] The object of the present embodiment is for the user to address input text in the target language to the grocer. If the text can be understood as an instruction to supply a type of item, this will be confirmed with visual feedback of several types; firstly, a positive expression will be displayed on the face of the grocer (area 304); secondly, the requested item will appear in the grocery basket transaction area (area 312) displayed on the screen 108; and thirdly the instruction will be confirmed by output text in the target language from the grocer (area 306).

[0041] If the input text can be understood as an instruction to purchase an item, but contains recognised spelling or grammatical errors, visual feedback of the transaction is given in the form of a confirmation of what the understood transaction should be as output text, and the display of the item in the grocery basket (area 312).

[0042] However, the existence of the error is indicated by the selection of a negative displayed expression on the face of the grocer (area 304), and a general indication as to the nature of the error is given by displaying text in the target language in a window indicating the grocer's thoughts (area 308).

[0043] This may be sufficient, taken with the user's own knowledge, to indicate to the user what the error is; if not, the user may select further assistance, in which case user guidance text indicating in more detail, in the source language, what the error is thought to be is displayed.

[0044] If the input text cannot be understood because one or more words (after spell correction) cannot be recognised, a negative expression is displayed in the face of the grocer (area 304) and output text in the target language is generated in the area 306 to question the unrecognised words.

[0045] If the words in the input text were all recognised but the text itself cannot be recognised for some other reason, then a negative expression is generated on the face of the grocer (304) and output text in the target language is generated in area 306 recording a failure to understand.

[0046] In such cases of complete lack of comprehension, a facial expression differing from the partial incomprehension shown in FIG. 4c is selected for display.
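The feedback cases described above can be summarised in a short sketch. It is illustrative only: the names `Expression`, `Feedback` and `choose_feedback`, and the three expression labels, are assumptions introduced here and do not appear in the embodiment.

```python
from dataclasses import dataclass
from enum import Enum


class Expression(Enum):
    # Hypothetical labels for the grocer's expressions (area 304).
    POSITIVE = "positive"   # input understood, no error detected
    PUZZLED = "puzzled"     # understood, but a spelling/grammar error was found
    BAFFLED = "baffled"     # complete lack of comprehension


@dataclass
class Feedback:
    expression: Expression
    show_item: bool          # place the item in the basket (area 312)?
    comprehension_text: str  # grocer's "thoughts" (area 308); may be empty


def choose_feedback(understood: bool, errors: list[str]) -> Feedback:
    """Map a parse outcome to the kind of visual feedback described above."""
    if understood and not errors:
        return Feedback(Expression.POSITIVE, True, "")
    if understood:
        # The transaction still goes ahead; the first error is reported.
        return Feedback(Expression.PUZZLED, True, errors[0])
    return Feedback(Expression.BAFFLED, False, "")
```

For example, `choose_feedback(True, ["Erreur d'orthographe!"])` yields a puzzled expression while still showing the item, which corresponds to the partial incomprehension case of FIG. 4c.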

[0047] Operation of Terminal 10

[0048] Referring to FIG. 6, to initiate use of the system, the user sets up a connection to the host computer 20 from the terminal 10 (step 402). In step 404, a program (applet) for controlling the display of the image data is downloaded.

[0049] The host computer 20 then downloads a file of data representing the background image, a plurality of files of data representing the different possible expressions of the grocer, and a plurality of files of data representing all the items on sale, in step 406.

[0050] In step 408, initial control data is received from the computer 20, in the form of a control data message 500 which, as shown in FIG. 7, comprises a target language output text string 506, corresponding to words to be spoken by the grocer and hence to be displayed in the display area 306; a source language user guidance text string 514 to be displayed in the user guidance display area 314 if this is selected for display by the user; one or more item symbols 512 which will cause the selection for display of the images of one or more items in the display area 312; an expression symbol 504 for selecting one of the downloaded expression image files for display on the face of the grocer in the display area 304; and a target language comprehension text string 508 for display in the display area 308 to indicate what the grocer would understand by target language text input by a user as described below.

[0051] In the initial message transmitted in step 408, the item symbol field 512 and comprehension text field 508 are both empty.
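As an illustration of the control message 500, the following sketch models its five fields as a record. The field names, the JSON encoding, the "neutral" expression symbol and the guidance wording are assumptions made for this example; the specification does not define a wire format.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class ControlMessage:
    expression_symbol: str                                  # field 504
    output_text: str                                        # field 506
    comprehension_text: str = ""                            # field 508
    item_symbols: list[str] = field(default_factory=list)   # field 512
    guidance_text: str = ""                                 # field 514


def initial_message() -> ControlMessage:
    """Opening message: fields 508 and 512 are left empty."""
    return ControlMessage(
        expression_symbol="neutral",  # hypothetical symbol name
        output_text="Vous désirez?",
        guidance_text="The grocer asks what you would like.",  # hypothetical
    )


def encode(message: ControlMessage) -> str:
    """Serialise for transmission; JSON is an assumed encoding."""
    return json.dumps(asdict(message), ensure_ascii=False)
```

A terminal-side program could decode such a message and update each display area from the corresponding field.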

[0052] In step 410, the CPU 102, under control of the program downloaded in step 404, first loads the background image to the frame store within the storage unit 110, and then overwrites the areas 304, 306, and, where applicable, 312 and 314; by generating image data representing the text strings and inserting it in the relevant windows 306, 308, 314; by selecting the facial expression image indicated by the expression symbol 504 and displaying this in the upper area of the person display area 304; and by selecting any item images indicated by the item symbols 512 and displaying these in the area 312.

[0053] With the exception of the window 302 (which would at this stage be empty), the appearance of the display unit 108 at this stage is as shown in FIG. 4a.

[0054] Thus, the background display consists of the display of all the item images in the display area 310 together with a corresponding text label indicating, in each case, the item name; the display of the icon 315 indicating tutorial assistance; the display of the figure of a grocer with one of the selected expressions; the display of a speech bubble containing the grocer's speech output 306; and the display of a basket 312 receiving items placed therein by the grocer in response to shopping instructions.

[0055] If, in step 412, an instruction to log off or exit is input by the user, the process terminates. Otherwise, the CPU 102 scans the keyboard 104 (step 414) for the input of a string of text terminated by a carriage return or other suitable character, which is displayed in the input text display area 302 and, when input is complete, transmitted to the computer 20 in step 416 via the modem and Internet 12.

[0056] In response to the transmission of input text in step 416, the computer 20 returns another control message 500 (received in step 418) and, in response thereto, the terminal 10 returns to step 410 to update the display to reflect the contents of the control message.

[0057] Thus, referring to FIG. 4b, the result of the input of the text string shown in area 302 of FIG. 4a is to cause the display of the text message “Voila un kilo de pommes! Et avec ca?” in the output text area 306 (this representing the contents of the field 506 of the received control message).

[0058] Field 504 contains a symbol corresponding to a cheerful or positive expression, and the corresponding bit map image is displayed in the upper portion of field 304.

[0059] Field 512 contains a symbol indicating the appearance of an apple and accordingly this symbol is displayed in display area 312. No data is contained in the comprehension text field 508. Data is contained in the user guidance text field 514 but not displayed since the user has not selected the icon 315.

[0060] If, at this stage, the text input in step 414 is as displayed in the field 302 of FIG. 4b (which contains the words “Trois cents grammes de beure”), the control data received in step 418 leads to the display indicated in FIG. 4c.

[0061] In this case, the target language text indicated in the field 306 (“Voila trois cents grammes de beurre! Et avec ca?”) indicates what the correct word is presumed to be, but the comprehension text field 508 of the received control message contains the target language text, displayed in field 308, “Erreur d'orthographe!” in a “thinks bubble” representation to indicate the thoughts of the grocer.

[0062] The expression symbol field 504 contains a symbol causing the display of a puzzled expression on the face of the grocer as shown in field 304. Since the transaction has been understood, the item (butter) is represented by a symbol in the item symbol field 512 and displayed in the area 312.

[0063] If, at this stage, the user selects the icon 315 (e.g. by a combination of key strokes or by the use of a pointing device such as a mouse) the contents of the user guidance (source language) text field 514 are displayed in the display area 314 which is overlaid over the background display as shown in FIG. 4d. In this embodiment, the guidance text contains three text fields; a first field 314a indicating generally, in the source language (e.g. English), what the words in the field 306 mean; an error analysis display 314b indicating, in the source language (e.g. English), the meaning of the words in the comprehension text field 308 and indicating what, in this case, the spelling error is assumed to be; and an option field 314c containing text listing the options for user input in response to the situation.

[0064] From the foregoing, the operation of the terminal 10 will therefore be understood to consist of uploading input text to the computer 20; and downloading and acting upon control messages in response thereto from the computer 20.

[0065] Action of the Host Computer 20

[0066] The host computer 20 will be understood to be performing the following functions:

[0067] 1. Scanning the input text to determine whether it relates to one of the transactions (e.g., in this case, sale of one of a number of different items) in a predetermined stored list.

[0068] 2. Determining whether all the information necessary for that transaction is complete. If so, causing the returned control message to display visual indications that this is the case. If not, causing the returned control message to include output text corresponding to a target language question designed to elicit the missing information.

[0069] 3. Spell checking and parsing the input text for apparent errors of spelling or grammar, and causing the returned control message to include the indicated errors.

[0070] 4. Generating the user guidance text indicating, in the source language, useful information about the target language dialogue.

[0071] Because the number of transactions to be detected is relatively small, the computer 20 does not need to “understand” a large number of possible different input text strings or their meanings; provided the input text can be reliably associated with one of the expected transactions, it is necessary only to confirm whether all input words are correctly spelt and conform to an acceptable word order, without needing to know in detail the nuances of meaning that input text may contain.

[0072] However, the use of a set of grammar rules and a vocabulary database in the embodiment, as discussed in greater detail below, enables the computer 20 to comprehend a much wider range of input texts than prior art tutoring systems which are arranged to recognise predetermined phrases.

[0073] Referring to FIG. 8, the store 206 contains the following data:

[0074] a lexical database 208 comprising a plurality of word records 208a, 208b . . . 208n each comprising:

[0075] the word itself, in the target language;

[0076] the syntactic category of the word (e.g. whether it is a noun, a pronoun, a verb etc);

[0077] the values for a number of standard features of the word (specifically, the gender of the word, for example);

[0078] information (a symbol) relating to the meaning of the word; for example, where the word is a noun or verb, the symbol may be its translation in the source language or where the word is another part of speech such as an article, data indicating whether it is the definite or indefinite article and whether it is singular or plural.

[0079] Also comprised within the store 206 is a rule database 210 comprising a plurality (e.g. 44 in this embodiment) of rules 210a, 210b . . . 210n each specifying a rule of syntax structure of the target language and associated with a particular syntactic category. For example, the rule for a noun phrase will specify that it must comprise a noun and the associated article, whereas that for a verb phrase specifies that it must include a verb and its associated complement(s), and may include a subject, with which the form of the verb must agree, and which may (together with the object of the verb) be one of several different syntactic categories (e.g. a noun, a noun phrase, a pronoun and so on).

[0080] In general, then, the rules will specify which types of words (or clauses) must be present in which order, and with what agreements of form, for a given semantic structure (e.g. a question).

[0081] In many target languages (for example French) agreement between the form of words is necessary. Thus, where a noun or a pronoun has an associated gender, then other parts of speech such as the definite or indefinite article, or the verb, associated with that noun or pronoun must have the same gender.

[0082] Likewise, where a noun or pronoun is associated with a number (indicating whether it is singular or plural) then the associated definite or indefinite article and/or verb must be singular or plural in agreement.

[0083] Other types of agreement may also be necessary, for example, to ensure that a word is in the correct case or tense. The need for such agreements is recorded in the relevant rules in the rules database.

[0084] A suitable semantic representation for the rules and words stored for use in the above embodiments may be found in “Translation using minimal recursion semantics” by A. Copestake, D. Flickinger, R. Malouf, S. Riehemann, and I. Sag, to appear in proceedings of the 6th International Conference on Theoretical and Methodological Issues in Machine Translation (LEUVEN), currently available via the Internet at http://hpsg.stanford.edu/hpsg/papers.html.

[0085] In order to detect simple errors, in this embodiment the rules stored in the rules database 210 comprise, for at least some of the rules, a first rule which specifies those agreements (for example of gender and number) which are grammatically necessary for the corresponding syntactic structure to be correct, but also a plurality of relaxed versions of the same rule, in each of which one or more of the agreement constraints is relaxed.

[0086] In other words, for a first rule 210a which specifies correct agreement of both gender and number, there are associated relaxed rules 210b and 210c, the first of which (210b) corresponds but lacks the requirement for agreement of gender, and the second of which (210c) corresponds but lacks the requirement for agreement of number.

[0087] Conveniently, the relaxed rules are stored following the correct rules with which they are associated.
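The pairing of a correct rule with its relaxed versions can be sketched as follows for a two-word noun phrase (article plus noun). This is a minimal illustration under stated assumptions, not the rule format of the embodiment: the `Word` and `Rule` records, the feature names and the function names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Word:
    text: str
    category: str  # e.g. "det" or "noun"
    gender: str    # "m" or "f"
    number: str    # "sg" or "pl"


@dataclass(frozen=True)
class Rule:
    name: str
    agreements: frozenset          # features that must match between words
    relaxed: Optional[str] = None  # constraint dropped from the correct rule


# Correct rule (cf. 210a): article and noun agree in gender and number.
STRICT = Rule("noun_phrase", frozenset({"gender", "number"}))


def relaxed_versions(rule: Rule) -> list[Rule]:
    """One relaxed rule per agreement constraint (cf. 210b, 210c)."""
    return [Rule(rule.name, rule.agreements - {f}, relaxed=f)
            for f in sorted(rule.agreements)]


def matches(rule: Rule, det: Word, noun: Word) -> bool:
    if det.category != "det" or noun.category != "noun":
        return False
    return all(getattr(det, f) == getattr(noun, f) for f in rule.agreements)


def parse_noun_phrase(det: Word, noun: Word) -> tuple[bool, Optional[str]]:
    """Try the correct rule first, then each relaxed rule in turn.

    Returns (parsed, error); the error names the relaxed constraint.
    """
    if matches(STRICT, det, noun):
        return True, None
    for rule in relaxed_versions(STRICT):
        if matches(rule, det, noun):
            return True, f"{rule.relaxed} agreement error"
    return False, None
```

With these definitions, "le pomme" parses only under the rule that drops gender agreement, so a gender agreement error is reported; an input violating both constraints fails to parse, because each relaxed rule here drops only one constraint.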

[0088] Rather than permanently storing all inflections of each word in separate word records 208 or storing all versions of the same word within its word record 208, conveniently an inflection table 212 is provided consisting of a plurality of inflection records, each consisting of a word stem and, for each of a predetermined plurality of different inflecting circumstances (such as cases, tenses and so on), the changes to the word endings of the stem.

[0089] Because many words exhibit identical inflection behaviour, the number of records 212a, 212b in the inflection table 212 is significantly smaller than the number of lexical records 208a . . . 208n in the lexical database 208. Each record in the lexical database 208 contains a pointer to one of the records in the inflection table 212, and the relationship is usually many to one (that is, several words reference the same inflection model record in the inflection table 212).

[0090] Before each use, or period of use, of the host computer 20 the CPU 204 reads the lexical records 208, and expands the lexical records table 208 to include a new record for each inflected version of the word, using the inflection table 212.

[0091] After operation of the present invention ceases, the CPU 204 correspondingly deletes all such additional entries. Thus, in periods when the invention is not in use, memory capacity within the computer 20 is conserved.
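The expansion of the lexical table before use might look like the sketch below. It assumes a simplified inflection model in which an ending is appended to the stored word; the record layout, the model name `regular_noun` and the sample entries are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class WordRecord:
    word: str
    category: str
    features: dict
    meaning: str   # symbol, here a source language translation
    model: str     # pointer into the inflection table (cf. 212)


# Inflection table: several words may share one model (many to one).
INFLECTIONS = {
    "regular_noun": {"sg": "", "pl": "s"},
}

LEXICON = [
    WordRecord("pomme", "noun", {"gender": "f"}, "apple", "regular_noun"),
    WordRecord("orange", "noun", {"gender": "f"}, "orange", "regular_noun"),
]


def expand(lexicon: list[WordRecord], inflections: dict) -> list[WordRecord]:
    """Return one record per inflected form; the stored lexicon is untouched."""
    expanded = []
    for rec in lexicon:
        for number, ending in inflections[rec.model].items():
            expanded.append(WordRecord(
                rec.word + ending,
                rec.category,
                {**rec.features, "number": number},
                rec.meaning,
                rec.model,
            ))
    return expanded
```

Because the expanded list is built separately, discarding it after a session restores the compact stored form, which mirrors the memory saving described above.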

[0092] Prior to expansion, the lexical table 208 in this embodiment contains 265 records.

[0093] Specific information about the transactions making up the grocer shop scenario is stored in a transaction table 214 consisting of a number of entries 214a, 214b . . . 214n.

[0094] The entries include information defining the items (e.g. apples) as being goods for sale, and defining units of measurement (e.g. kilos), and relating each kind of item to the units of measure in which it is sold and the price per unit. Data is also stored associating each item with the item symbol and the graphics data representing the item (to be initially transmitted to the terminal 10).
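A transaction table entry of this kind, together with a check for missing transaction information, could be sketched as below. The field names, prices, symbols and file names are invented for illustration and are not taken from the embodiment.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TransactionEntry:
    item: str
    units: tuple           # units of measure in which the item is sold
    price_per_unit: float  # hypothetical prices
    item_symbol: str       # symbol sent in field 512
    image_file: str        # graphics data sent to the terminal at start-up


TRANSACTION_TABLE = {
    "pommes": TransactionEntry("pommes", ("kilo",), 2.5, "APPLE", "apple.gif"),
    "beurre": TransactionEntry("beurre", ("gramme",), 0.01, "BUTTER", "butter.gif"),
}


def missing_information(item: str, quantity: Optional[float],
                        unit: Optional[str]) -> list[str]:
    """List what is still needed before the purchase can be completed."""
    entry = TRANSACTION_TABLE[item]
    missing = []
    if quantity is None:
        missing.append("quantity")
    if unit not in entry.units:
        missing.append("unit")
    return missing


def total_price(item: str, quantity: float) -> float:
    return TRANSACTION_TABLE[item].price_per_unit * quantity
```

A non-empty result from `missing_information` would prompt a target language question asking for the missing details, rather than completing the sale.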

[0095] A response table 216 consists of a plurality of entries 216a, 216b . . . each corresponding to one type of output control message 500 generated by the computer 20, and storing, for that output, the anticipated types of response, ranked in decreasing order of likelihood.

[0096] For example, the likely responses to the opening message “Vous désirez?” are, firstly, an attempt to buy produce; secondly, an attempt to enquire about produce (for example to ask the price).

[0097] On the other hand, the responses to the output “Et avec ca?” which follows a completed purchase include the above and additionally the possibility of the end of the session, in which case a statement indicating that nothing more is sought is expected.

[0098] Likewise, if the last response was to supply price information, the next response could be an attempt to complete a transaction for the subject of the enquiry, or could be a different enquiry, or an attempt to purchase something different, or an instruction to end the session.

[0099] Each entry in the response table also includes the associated source language response assistance text displayed in the text areas 314a and 314c.

[0100] Each of the possible responses in the response table 216 contains a pointer to an entry in a syntactic category table 218, indicating what syntactic category the response from the user is likely to fall into; for example, if the last output text displayed in the text area 306 asks “How many would you like?”, the answer could be a sentence including a verb (“I would like three kilos please”) or a noun phrase (“Three kilos”).

[0101] Finally, a buffer 220 of most recent system outputs is stored, storing the last, or the last few (e.g. two or three), system outputs as high level semantic structures. By reference to the system output buffer, it is therefore possible to determine to what the text input by the user is an attempt to respond and hence, using the response table 216, to assess the likeliest types of response, and (by reference to the syntactic categories table 218) the likely syntactic form in which the anticipated responses will be expressed.
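The interplay of the output buffer, the response table and the syntactic category table can be sketched as follows. The keys used for output types, response types and syntactic categories are hypothetical labels chosen for this example, and the category assignments are assumptions.

```python
from collections import deque

# Response table (cf. 216): anticipated response types per output type,
# ranked in decreasing order of likelihood.
RESPONSE_TABLE = {
    "greeting": ["purchase", "enquiry"],
    "purchase_confirmed": ["purchase", "enquiry", "end_session"],
    "price_given": ["purchase_enquired_item", "enquiry", "purchase",
                    "end_session"],
}

# Syntactic category table (cf. 218): likely forms of each response type.
SYNTACTIC_CATEGORIES = {
    "purchase": ["sentence", "noun_phrase"],
    "purchase_enquired_item": ["sentence", "noun_phrase"],
    "enquiry": ["question"],
    "end_session": ["sentence"],
}


def new_output_buffer(size: int = 3) -> deque:
    """Buffer of the last few system outputs (cf. 220)."""
    return deque(maxlen=size)


def anticipated_responses(buffer: deque) -> list[tuple[str, list[str]]]:
    """Pair each likely response to the last output with its likely forms."""
    if not buffer:
        return []
    last_output = buffer[-1]
    return [(response, SYNTACTIC_CATEGORIES[response])
            for response in RESPONSE_TABLE.get(last_output, [])]
```

Since the buffer has a fixed maximum length, appending a new output silently discards the oldest one, so only the last few outputs are ever consulted.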

[0102] Operation of the Host Computer 20

[0103] Referring to FIG. 9, the operation of the host computer in this embodiment will now be described in greater detail.

[0104] Referring to FIG. 9a, in step 602, an attempt by a terminal 10 to access the computer 20 is detected.

[0105] In step 604, the CPU 204 accesses the stored file within the store 206 storing the program to be downloaded and transmits the file (e.g. in the form of an Applet, for example in the Java (TM) programming language) to the terminal 10.

[0106] In step 606, the CPU 204 reads the transaction data table 214 and transmits, from each item record, the item image data file and the item type symbol.

[0107] The initial control message 500 sent in step 608 is predetermined, and consists of the data shown in FIG. 4a (and described above in relation thereto) together with the stored text for display, if required, in the fields 314a and 314c which is stored in the response table 216 in the entry relating to this opening system output.

[0108] Referring to FIG. 9b, in step 610, the host computer 20 awaits a text input from the terminal 10. On receipt, in step 611, if the language permits contractions such as “l'orange”, the contraction is expanded as a first step. Then, each word is compared with all the lexical entries in the table 208. Any word not present in these tables is assumed to be a mis-spelling which may correspond to one or more valid words; if a mis-spelling exists which could correspond to more than one valid word (step 614) then a node is created in the input text prior to the mis-spelling and each possible corresponding valid word is recorded as a new branch in the input text in place of the mis-spelt word (step 616).

[0109] If the word is not recognised even after spell correction (step 612) the word is retained and an indication of failure to recognise it is stored (step 613).

[0110] This process is repeated (step 620) until the end of the input text is reached (step 618).

[0111] If (step 622) any words were not recognised in step 612, it will be necessary to generate an output text indicating missing words and accordingly the CPU 204 proceeds to FIG. 9f (discussed below). Otherwise, at this stage, the input text consists entirely of words found in the table 208, several of which may appear in several alternative versions where a spelling error was detected, so as to define, in such cases, a stored lattice of words branching before each such mis-spelling into two or more alternative word paths.

[0112] The or each mis-spelling is stored prior to its replacement.
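The construction of the word lattice from spell-corrected input can be illustrated with the sketch below. The specification does not state how candidate corrections are found; this example assumes similarity matching with Python's standard `difflib.get_close_matches`, and the small word list and the cutoff of 0.8 are likewise assumptions.

```python
import difflib
from itertools import product

# Hypothetical fragment of the expanded lexical table.
LEXICON = ["trois", "cents", "grammes", "de", "beurre", "un", "kilo", "pommes"]


def build_lattice(text: str, lexicon: list[str] = LEXICON):
    """Return (lattice, misspellings, unknown).

    The lattice holds one list of alternatives per input word; a word
    with several candidate corrections yields several branches.
    """
    lattice, misspellings, unknown = [], [], []
    for word in text.lower().split():
        if word in lexicon:
            lattice.append([word])
            continue
        candidates = difflib.get_close_matches(word, lexicon, n=3, cutoff=0.8)
        if candidates:
            misspellings.append(word)  # stored prior to its replacement
            lattice.append(candidates)
        else:
            unknown.append(word)       # not recognised even after correction
            lattice.append([word])
    return lattice, misspellings, unknown


def word_paths(lattice: list[list[str]]) -> list[list[str]]:
    """Enumerate every path through the lattice for parsing in turn."""
    return [list(path) for path in product(*lattice)]
```

For the input “Trois cents grammes de beure”, the mis-spelt word is replaced by its single candidate, leaving one path through the lattice; an unrecognised word is retained and flagged so that the output can question it.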

[0113] Referring to FIG. 9c, next, in step 624, each word is looked up in the word store 208 and each possible syntactic category for each word (e.g. noun, verb) is read out, to create for each word a list of alternative forms defining more branches in the lattice of words (step 626). The process is repeated (step 630) until the end of the input text is reached (step 628).

[0114] At this point, the processor 204 selects a first path through the lattice of words thus created and reads each of the rules in the rule store 210 in turn, and compares the word path with each set of rules.
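As a minimal sketch of how the word paths through such a lattice could be enumerated, assuming the lattice is held as a list of alternative words per position and that the syntactic categories are available in a simple dictionary (both hypothetical representations, not taken from the description):

```python
from itertools import product

# Hypothetical category entries; the word store 208 would supply these.
CATEGORIES = {
    "la": ["determiner", "pronoun"],
    "orange": ["noun", "adjective"],
}


def word_paths(lattice):
    """Expand each word into (word, category) branches and list every path."""
    expanded = [
        [(word, category) for word in alternatives for category in CATEGORIES[word]]
        for alternatives in lattice
    ]
    # Each element of the product is one complete path through the lattice.
    return list(product(*expanded))


paths = word_paths([["la"], ["orange"]])
```

Here two words with two categories each give four paths, each of which would then be compared in turn with the rules in the rule store.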

[0115] On each comparison, if the relationships between the properties of the words present correspond to the relationships specified in the rules, then the syntactic category associated with the rule in question is detected as being present, and a syntactic structure, corresponding to that syntactic category and the words which are detected as making it up, is stored.

[0116] The CPU 204 applies the correct form of each rule (e.g. 210 a) which specifies the necessary agreements between all words making up the syntactic category of the rule, and then in succession the relaxed forms of the same rule. When one of the forms of the rule is met, the syntactic category which is the subject of the rule is deemed to be present, and a successful parse is recorded.

[0117] However, the CPU 204 additionally stores information on any error encountered, by referring to the identity of the relaxed rule which successfully parsed the text; if the rule relaxes the gender agreement criterion, for example, a gender agreement error is recorded as being present between the words which were not in agreement.
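The strict-then-relaxed application of a rule can be illustrated with the following Python sketch for a single noun-phrase rule. The `Word` record, the rule names and the representation of a rule as a list of enforced agreements are assumptions introduced for illustration; the embodiment itself is said to be conveniently written in Prolog.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Word:
    text: str
    category: str
    gender: str
    number: str


# Hypothetical forms of one rule: the strict form first, then relaxed forms,
# each relaxing exactly one agreement criterion.
NOUN_PHRASE_FORMS = [
    ("strict", ("gender", "number")),
    ("gender relaxed", ("number",)),
    ("number relaxed", ("gender",)),
]


def parse_noun_phrase(det, noun):
    """Try the strict form, then each relaxed form; record any relaxed criterion."""
    if det.category != "det" or noun.category != "noun":
        return None
    for name, agreements in NOUN_PHRASE_FORMS:
        if all(getattr(det, a) == getattr(noun, a) for a in agreements):
            relaxed = {"gender", "number"} - set(agreements)
            errors = [
                f"{a} agreement error between '{det.text}' and '{noun.text}'"
                for a in sorted(relaxed)
            ]
            return {"category": "noun_phrase", "words": (det.text, noun.text),
                    "rule": name, "errors": errors}
    return None  # no form of the rule is met
```

The identity of the form that succeeded is what allows the error to be named: a parse that succeeds only under the "gender relaxed" form implies a gender agreement error between the two words.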

[0118] The parse may pass twice (or more times) through the input text, since some rules may accept as their input the syntactic structures generated in response to other rules (for example noun phrases and verb phrases).

[0119] If, after the parsing processing has concluded, it has been possible to parse the complete input text (step 636), the semantic structure thus derived is stored (step 636) and the next word path is selected (step 640) until all word paths through the word lattice have been parsed (step 641).

[0120] Next, in step 644, the CPU 204 reads the output response buffer 220, notes its previous output, and looks up the entry in the response table 214 associated with it. The response first read from the list is that considered most likely to correspond to the last output.

[0121] Next, the CPU 204 accesses, for that response, the corresponding entry in the syntactic category table 218 (again, the first entry selected corresponds to that most likely to be found).

[0122] Next, in step 646 the or each semantic structure derived above as a result of the parse of the input text is compared (steps 648-652) with the expected response syntactic category until a match is found.

[0123] The CPU 204 first reviews the parses performed by the strict forms of grammatical rules and, where a complete parse is stored based on the strict rules (i.e. with no errors recorded as being present) this is selected. Where no such parse exists, the CPU 204 then selects for comparison the or each parse including recorded errors, based on the relaxed forms of the rules.

[0124] At this point, in step 654, the CPU 204 ascertains whether the semantic structure contains an action which could be performed. For example, the semantic structure may correspond to:

[0125] a question which can be answered, or

[0126] a request for a sale transaction which can be met, or

[0127] an indication that a series of one or more sale transactions is now complete, in which case a price total can be calculated and indicated.

[0128] In the first of these cases, the input semantic structure needs to correspond to a question and needs to mention the type of item of which the price is being asked (in this embodiment price represents the only datum stored in relation to each transaction, but in general other properties could be questioned).

[0129] In the second case, the input statement needs to specify a kind of item to be sold and a quantity which is valid for that kind of goods (e.g. “apples” and “three kilos”). It may be phrased as a sentence in the target language (“I would like three kilos of apples”) or as a question (“Could I have three kilos of apples?”) or as a noun phrase (“Three kilos of apples”).

[0130] In the last case, the input text could take a number of forms, ranging from a word to a sentence.

[0131] If the input text does not obviously correspond to any action which could be carried out, further comparisons are attempted (the CPU 204 returns to step 652) and if no possible action is ultimately determined (or if one or more words are not recognised in step 612 above) then the CPU 204 determines that the input text cannot be understood (step 656).

[0132] If, on the other hand, all the information necessary to carry out an action (complete a purchase, answer a question etc.) is present then the CPU 204 selects that action for performance (step 658).

[0133] Finally, if it is possible to determine the nature of the action to be performed but not to perform it, then the CPU 204 formulates (step 660) a query to elucidate the missing information for the performance of the action.

[0134] For instance, if the input text is (in the target language) “I would like to buy some apples”, the CPU 204 determines that the intended action is to purchase apples, accesses the record for apples in the transaction table 214, and notes that the quantity information is missing.
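The three outcomes of steps 654 to 660 (action performed, query formulated, input not understood) might be sketched as below. The dictionary form of the semantic structure, the `kind` labels, the price value and the `basket` list are hypothetical and serve only to make the decision logic concrete.

```python
# Hypothetical transaction table: one price per kind of item, as in the embodiment.
TRANSACTIONS = {"apples": {"unit": "kilo", "price": 2.50}}


def decide(structure, basket):
    """Map a parsed semantic structure to an action, a query, or a failure."""
    kind, item = structure.get("kind"), structure.get("item")
    if kind == "done":
        # Series of sale transactions complete: calculate the price total.
        total = sum(TRANSACTIONS[i]["price"] * q for i, q in basket)
        return ("perform", "total", total)
    if item not in TRANSACTIONS:
        return ("not_understood",)
    if kind == "price_question":
        return ("perform", "answer_price", TRANSACTIONS[item]["price"])
    if kind == "purchase":
        quantity = structure.get("quantity")
        if quantity is None:
            # Nature of the action is known but it cannot yet be performed.
            return ("query", "quantity", item)
        basket.append((item, quantity))
        return ("perform", "sell", item, quantity)
    return ("not_understood",)
```

For the example above, `decide({"kind": "purchase", "item": "apples"}, [])` yields a quantity query for apples, since the item is known but no quantity was given.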

[0135] In each case, the CPU 204 is arranged to derive output text, user guidance text and an indication of suitable images for display, for transmission to the terminal 10.

[0136] Where unrecognised words have caused the input text not to be understood, the CPU 204 generates user guidance text (step 666) indicating to the user the words which have not been understood and prompting the user for replacements. In step 668, output text (in the target language) is generated indicating that the grocer cannot understand the words concerned.

[0137] The same process is performed where (step 656) the input text was not understood for other reasons, except that the output text and user guidance texts refer to general misunderstanding rather than specific words.

[0138] Error Present

[0139] In the event that an action has been fully or partly possible, the semantic structure corresponding to the action to be undertaken (for example indicating that three kilograms of apples are to be sold, or that a question is to be asked requesting the quantity of apples) is stored in the output buffer 220.

[0140] In the event that an action has been fully or partly possible, then in step 662 the CPU 204 determines whether spelling or grammatical errors were entered. If so, then in step 664, the CPU 204 selects comprehension text consisting of one or both of the pre-stored target language phrases (“Erreur d'orthographe!” or “Erreur de grammaire!”) for transmission in the comprehension text field 508 and display in the comprehension text area 308.

[0141] At the same time, the CPU generates source language help text for transmission in the user guidance text field 514 and display in the user guidance area 314 b. Where the error is a spelling mistake, the text comprises, in the source language, the words “What the tutor thinks you did wrong is . . . I think you made a spelling mistake, (stored input word) should be (word with which it was replaced in the successful parse)”.

[0142] Where the error is a grammatical error, the CPU determines which rule failed to be met, and thereby determines whether the error was an error of gender or number, or an error of subject/verb agreement.

[0143] The text then generated is “What the tutor thinks you did wrong is . . . I think you made a grammatical mistake, try checking you have used the right (gender, number or verb form)”.
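A short Python sketch of how the comprehension text and the source language help text could be assembled from recorded errors is given below. The message wording is taken from the description; the error records (dictionaries with `type`, `typed`, `replacement` and `relaxed` keys) are a hypothetical representation.

```python
PREFIX = "What the tutor thinks you did wrong is . . . "

# Maps the relaxed agreement criterion (assumed labels) to the wording of the hint.
FEATURE_NAMES = {"gender": "gender", "number": "number", "subject_verb": "verb form"}


def comprehension_text(errors):
    """Select one or both pre-stored target language phrases."""
    phrases = []
    if any(e["type"] == "spelling" for e in errors):
        phrases.append("Erreur d'orthographe!")
    if any(e["type"] == "grammar" for e in errors):
        phrases.append("Erreur de grammaire!")
    return " ".join(phrases)


def help_text(error):
    """Build the source language guidance for one recorded error."""
    if error["type"] == "spelling":
        return (PREFIX + "I think you made a spelling mistake, "
                f"{error['typed']} should be {error['replacement']}")
    feature = FEATURE_NAMES[error["relaxed"]]
    return (PREFIX + "I think you made a grammatical mistake, "
            f"try checking you have used the right {feature}")
```

The spelling branch relies on the mis-spelling having been stored before its replacement, and the grammar branch on the identity of the relaxed rule that parsed the text.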

[0144] Next, in step 666 the CPU 204 selects the text to be output for the user guidance text areas 314 a and 314 c. The text for the area 314 a is obtained by looking up the stored last output in the buffer 220 and accessing the text stored in the corresponding record 216 for that output. This text describes the response selected in step 658 or the query formulated in step 660; for example, where the action of supply of goods has been successfully completed (step 658) the text in field 314 a will read (in the source language) “What the shop keeper has just said is . . . The shop keeper has supplied your goods, and is waiting for you to give him a new instruction.”

[0145] The text in the field 314 c offers the user logical response options, and is obtained by looking up the text stored with the anticipated responses in the field within the table 216 which relates to the action or query just generated in step 658 or 660 and stored in the buffer 220.

[0146] Finally, in step 668, the output text field 506 to be sent in the message 500 and displayed in the output text area 306 is generated.

[0147] The generation could take the form of simple selections of corresponding text, as in the above described text generation stages, but it is preferred in this embodiment to generate the output text in a freer format, since this is likely to lead to greater variability of the responses experienced by the user and lower memory requirements.

[0148] To achieve this, the CPU 204 utilises the rules stored in the rule table 210 and the words stored in the lexicon 208 to generate text from the high level response generated in steps 658 or 660. In general, the process is the reverse of the parsing process described above, but simpler since the process starts from a known and deterministic semantic structure rather than an unknown string of text.

[0149] The first stage, as shown in FIG. 9f, is to select from the lexicon table 208 a subset of words which could be used in the output text. In a step 6681, the CPU 204 reviews the first term in the semantic structure generated in step 658 or 660. In a step 6682, the CPU 204 looks up, in the lexical table 208, each word the record of which begins with that term.

[0150] In step 6683, the CPU 204 compares the record for the word with the output semantic structure. If all other terms required by the word are present in the output semantic structure, then in step 6684 the word is stored for possible use in text generation; if not, the next word beginning with that term is selected (step 6685).

[0151] When the last word is reached (step 6686), the next term is selected (step 6687) and the process is repeated until the last term is reached (step 6688), at which point all words which could contribute to the generation of the output text have been stored.
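The word selection loop of steps 6681 to 6688 can be sketched as two nested iterations, over the terms of the output semantic structure and over the lexical records beginning with each term. The list-of-tuples lexicon and the French entries below are hypothetical stand-ins for the lexical table 208.

```python
# Hypothetical lexical records: (word, terms), where the first term is the
# one under which the record is looked up and the rest are further requirements.
LEXICAL_RECORDS = [
    ("combien", ("query", "quantity")),
    ("pourquoi", ("query", "reason")),
    ("pommes", ("apples",)),
    ("poires", ("pears",)),
]


def select_words(structure_terms):
    """Collect every word whose record is fully satisfied by the structure."""
    selected = []
    for term in structure_terms:                      # steps 6681, 6687, 6688
        for word, record in LEXICAL_RECORDS:          # steps 6682, 6685, 6686
            if record[0] != term:
                continue
            # Step 6683: all other terms required by the word must be present.
            if all(other in structure_terms for other in record[1:]):
                selected.append(word)                 # step 6684
    return selected
```

For a structure with the terms query, quantity and apples, this selects a quantity query word and the word for apples, and rejects query words whose further terms are absent from the structure.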

[0152] Next, in step 6689, the CPU 204 accesses the rules table 210 and applies the rules relating to the stored terms of the output semantic structure to the words selected in the preceding steps to generate output text.

[0153] Thus, where the quantity of apples required is to be queried, the semantic structure includes a term specifying a query; a term specifying that the subject of the query is quantity; and a term specifying that the object of the query is that which an attempt was previously made to purchase, namely apples.

[0154] The words selected in steps 6681-6688 consist of the word for “apples” in the target language, and the query word or phrase which specifies quantity. Application of the rules for construction of a query then leads to the generation of a grammatically correct question.

[0155] Returning to FIG. 9d, in step 670 the CPU 204 transmits the control message 500 formed by the above steps to the terminal 10. The CPU 204 then returns to step 610 of FIG. 9b to await the next received input text.

[0156] Other Embodiments and Modifications

[0157] In the foregoing, for clarity, the operations of the embodiment have been described in general terms, without specifying in detail the steps which are performed by separate programme components. In a convenient implementation, however, the applet program would control all image displaying operations, and image data would be supplied by the server program on the host computer 20, rather than by the application program performing the semantic processing.

[0158] In the foregoing embodiments, conveniently, the semantic processing performed on the host processor 20 may be written in the Prolog language, and the parsing may be performed by Prolog backtracking.

[0159] It will, however, be recognised that the invention could be implemented using any convenient hardware and/or software techniques other than those described above.

[0160] Equally, whilst a language training program has been described, it will be recognised that the invention is applicable to other types of training in which it is desired to emulate the interaction of a user with another person.

[0161] Further, it will be apparent that the terminal 10 and computer 20 could be located in different jurisdictions, or that parts of the invention could further be separated into different jurisdictions connected by appropriate communication means. Accordingly, the present invention extends to any and all inventive subcomponents and subcombinations of the above described embodiments located within the jurisdiction hereof.

[0162] In the above described embodiments, text input and output have been described. However, in a further embodiment, the terminal 10 may be arranged to accept input speech via a microphone and transmit the speech as a sound file to the computer 20, which is correspondingly arranged to apply a speech recognition algorithm to determine the words present in the input.

[0163] Together, or separately, the output text generated by the grocer may be synthesised speech, and accordingly in this embodiment the computer 20 comprises a text to speech synthesizer arranged to generate a sound file transmitted to the terminal 10. In either such case, a suitable browser program other than the above described Netscape (TM) browser is employed.

[0164] Other forms of input and output (for example, handwriting recognition input) could equally be used.

[0165] Although in the preceding embodiments the redisplay of the head portion of the grocer image has been described, it will be apparent that it may be more convenient simply to redisplay the entire image of the grocer in other embodiments.

[0166] It will be apparent that the transactions described above need not be those of a grocer's shop. The scenario could, for example, involve a clothes shop (in which case the articles sold would comprise items of clothing) or a butcher's shop (in which case the items sold would comprise cuts of meat). Equally, other forms of training than foreign language training could be involved, in which case the scenarios could involve familiarity in the source language with scenarios such as emergency or military procedures.

[0167] Accordingly, the invention is not limited by the above described embodiments but extends to any and all such modifications and alternatives which are apparent to the skilled reader hereof.

1. Training apparatus for training a user to engage in transactions with another person whom the apparatus is arranged to simulate, the apparatus comprising: an input for receiving input dialogue from a user; a lexical store containing data relating to individual words of said input dialogue; a rule store containing rules specifying grammatically allowable relationships between words of said input dialogue; a transaction store containing data relating to allowable transactions between said user and said person; a processor arranged to process the input dialogue to recognise the occurrence therein of words contained in said lexical store in the relationships specified by the rules contained in said rule store in accordance with the data specified in the transaction store, and, in dependence upon said recognition, to generate output dialogue indicating when correct input dialogue has been recognised; and an output device for making the output dialogue available to the user.
2. Apparatus according to claim 1, in which said rule store contains first rules comprising criteria specifying correct relationships between words of said word store, and, associated with said first rules, one or more second rules each corresponding to a said first rule but with one relationship criterion relaxed.
3. Apparatus according to claim 2, wherein said relationship criteria correspond to agreements between words (for example, agreements of gender or number).
4. Apparatus according to any preceding claim, in which the processor is arranged to generate output dialogue responsive to input dialogue, and to detect recognised errors in said input dialogue, and, on detection thereof, to indicate said recognised errors separately of said responsive output dialogue.
5. Apparatus according to claim 4 when appended to claim 2 or claim 3, in which said processor is arranged to detect said recognised errors on detection of input dialogue containing words which meet said second, but not said first, rules.
6. Apparatus according to any preceding claim which is arranged to provide language training, in which said rules, said words, and said output dialogue are in a training target language, and further arranged to generate user guidance dialogue in a source language for said user and different to said target language.
7. Apparatus according to claim 6 in which the user guidance dialogue comprises guidance as to the meaning of the output dialogue.
8. Apparatus according to claim 6 or claim 7 in which the user guidance dialogue comprises an explanation of any detected errors in the input dialogue.
9. Apparatus according to any of claims 6 to 8, in which the user guidance dialogue indicates suitable further input dialogue which could be provided.
10. Apparatus according to any preceding claim in which said input dialogue and/or said output dialogue comprise text.
11. Apparatus according to any of claims 1 to 9 in which said input dialogue comprises speech, and further comprising a speech recogniser arranged to recognise the words of said speech.
12. Apparatus according to any of claims 1 to 9 in which said output dialogue comprises speech, said apparatus further comprising a speech synthesizer.
13. Apparatus according to any preceding claim, further comprising a user interface arranged to accept said input dialogue and make available said output dialogue to the user.
14. Apparatus according to claim 13, in which said user interface comprises a display and in which said output dialogue is displayed on said display.
15. Apparatus according to claim 14 when appended to any of claims 6 to 9, in which said user guidance text is normally not displayed on said display, and further comprising an input device via which a user may selectively cause the display of said user guidance text on said display.
16. Apparatus according to any of claims 13 to 15, in which said user interface is located remotely from said processor and is coupled thereto via a communications channel.
17. Language training apparatus comprising a processor arranged to accept input dialogue in the target language, to detect recognised errors in said input dialogue, to generate responsive output dialogue in the target language and, when a said recognised error is detected, to generate a separate indication of the presence of said recognised error.
18. Apparatus according to claim 17 in which the separate indication of the recognised error is an indication in the target language.
19. Apparatus according to claim 17 or claim 18 in which the separate indication comprises explanatory text in a source language of the user different to said target language.